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Abstract 

The vomeronasal organ (VNO) is an olfactory structure that detects pheromones and environmental cues. It consists of sensory 
neurons that express evolutionary unrelated groups of transmembrane chemoreceptors. The predominant V1 R and V2R receptor 
repertoires are believed to detect airborne and water-soluble molecules, respectively. It has been suggested that the shift in habitat of 
early tetrapods from water to land is reflected by an increase in the ratio of V1 RA/2R genes. Snakes, which have a very large VNO 
associated with a sophisticated tongue delivery system, are missing from this analysis. Here, we use RNA-seq and RNA in situ 
hybridization to study the diversity, evolution, and expression pattern of the corn snake vomeronasal receptor repertoires. Our 
analyses indicate that snakes and lizards retain an extremely limited number of V1R genes but exhibit a large number of V2R 
genes, including multiple lineages of reptile-specific and snake-specific expansions. We finally show that the peculiar bigenic pattern 
of V2R vomeronasal receptor gene transcription observed in mammals is conserved in squamate reptiles, hinting at an important but 
unknown functional role played by this expression strategy. Our results do not support the hypothesis that the shift to a vomeronasal 
receptor repertoire dominated by V1 Rs in mammals reflects the evolutionary transition of early tetrapods from water to land. This 
study sheds light on the evolutionary dynamics of the vomeronasal receptor families in vertebrates and reveals how mammals and 
squamates differentially adapted the same ancestral vomeronasal repertoire to succeed in a terrestrial environment. 
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Introduction 

The vomeronasal organ (VNO), or Jacobson's organ, contains 
an olfactory sensory neuroepithelium enclosed in a cartilagi- 
nous or bony capsule in contact with the base of the nasal 
cavity. It plays a major role in interindividual interactions 
(through the detection of pheromones and kairomones) and 
environmental recognition (Houck 2009; Su et al. 2009). 
Chemosensory receptor proteins are present on the microvilli 
of vomeronasal sensory neuron dendrites and interact with 
molecules in the VNO cavity. The predominant chemosensory 
receptors in the vertebrate VNO come in three classes: formyl 
peptide receptors (FPRs), V1 Rs, and V2Rs. These are evolutio- 
nary unrelated families of seven-transmembrane (7TM) 
G-protein-coupled receptors (GPCRs) (Dulac and Axel 1995; 



Herrada and Dulac 1997; Matsunami and Buck 1997; Ryba 
and Tirindelli 1997). Previous analyses of fully sequenced ver- 
tebrate genomes (Grus and Zhang 2007) indicated de novo 
appearance and expansion of V1R and V2R subfamilies in 
some lineages, as well as the presence of a significant 
number of pseudogenes in others (Grus et al. 2005; Young 
et al. 2005; Shi and Zhang 2007; Young and Trask 2007). The 
remarkable divergence between V1R and V2R gene reper- 
toires is thought to reflect the ecological niche and social 
habits of a particular species. Indeed, a large expansion of 
the V1R family accompanied by a sharp decline in the 
number of V2R genes was observed in mammals in compar- 
ison to Xenopus and fish species. Together with the notion 
that V1Rs and V2Rs might detect volatile and water-soluble 
molecules, respectively, these results led to the hypothesis that 
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this shift, reflected by the ratio of V1R and V2R functional 
genes, corresponds to the evolutionary transition of early tet- 
rapods from water to land (Shi and Zhang 2007). 

Surprisingly, the vomeronasal chemosensory receptor gene 
repertoire of snakes, a vertebrate clade that possesses a very 
large VNO (Dawley 1998), has never been studied, probably 
because of the under-representation of nonmammalian spe- 
cies among fully sequenced vertebrate genomes (Milinkovitch 
and Tzika 2007; Milinkovitch et al. 2010). The VNO of snakes 
is associated with a sophisticated tongue delivery system: The 
tongue collects chemicals in the environment by means of 
tongue flicking and transfers them to the vomeronasal open- 
ings on the palate. Several neurological and behavioral studies 
demonstrated that snakes largely depend on their vomerona- 
sal system for prey trailing, capture, and consumption, as well 
as for courtship and shelter selection (Miller and Gutzke 1 999; 
Zuri and Halpern 2003; Huang et al. 2006). Components of 
the vomeronasal receptor signal transduction pathway, 
including several G proteins and the secondary messenger 
triphosphoinositol (IP3), have been identified in the garter 
snake VNO (Luo et al. 1 994). In the same species, stimulations 
by prey-derived chemoattractants generate both transient 
activation of neurons in the vomeronasal sensory epithelium 
(Cinelli et al. 2002) and an increase in firing rates of individual 
neurons in the accessory olfactory bulb (AOB), that is, the 
projection site of vomeronasal sensory neurons in the brain 
(Jiang et al. 1990). 

Recent technological advances allow for de novo sequen- 
cing and assembly of transcriptomes (RNA-seq), that is, even 
in the absence of reference genomes (Gibbons et al. 2009; 
Schwartz et al. 2010; Surget-Groba and Montoya-Burgos 
2010; Tzika et al. 2011). Here, using RNA-seq and RNA in 
situ hybridization, we report on the diversity and evolution 
of the corn snake {Elaphe guttata, recently renamed 
Pantherophis guttatus) vomeronasal receptor repertoire. We 
found a large V2R repertoire composed of small, but multiple, 
snake-specific and reptile-specific expansions and only a few 
V1 R receptor genes. We also show punctate expression of 
V2Rs in the corn snake vomeronasal neuroepithelium, an 
expression pattern compatible with that observed in mammals 
(Martini et al. 2001; Silvotti et al. 2007; Ishii and Mombaerts 
201 1). Our results refute the hypothesis that the distribution 
of vomeronasal receptors in vertebrates is dictated by the 
evolutionary transition from water to land (Shi and Zhang 
2007) and shed light on the evolutionary dynamics of V1 R 
and V2R families in vertebrates. 

Materials and Methods 

Animals 

Corn snakes (Pantherophis guttatus) were obtained from our 
in-house breeding colony and maintained according to the 
Geneva Canton regulations (authorization 1008/3421/1 R). 



All animals were sexually mature, unless indicated otherwise, 
and 2-5 years old. 

Histology 

The VNO of one male adult corn snake was dissected, 
fixed in 4% paraformaldehyde (PFA) overnight at 4°C, and 
decalcified in 0.5 M ethylenediaminetetraacetic acid (EDTA) 
for 3 days at 4°C. Seven-micrometer paraffin sections 
were stained with hematoxylin and eosin using standard 
procedures. 

Labeling of VNO Epithelium by Retrograde Neuronal 
Tracing 

The head of a juvenile male corn snake was fixed in 4% 
PFA for 4h at 4°C. The skull was opened and 
1,V-dioctadecyl-3,3,3 / ,3 / -tetramethylindocarbocyanine per- 
chlorate (Dil) crystals (Invitrogen) were placed on top of the 
AOB and the main olfactory bulb (MOB). The head was 
embedded in 4% low-melting agarose and placed in 4% 
PFA at 37 °C for 6 weeks before it was cut sagittally. The 
VNO was dissected, embedded in OCT (optimal cutting tem- 
perature embedding medium), and frozen on dry ice. Sixteen- 
micrometer coronal sections were prepared from cryoblocks, 
counterstained with Hoechst (Invitrogen), and mounted with 
Vectashield (Vector Labs). 

cDNA Preparation and Deep Sequencing 

Immediately after sacrifice of the animals, the VNO was dis- 
sected and stored in RNA-protective solution (PrepProtect, 
Miltenyi Biotech). Following tissue disruption with a Polytron 
device (Kinematica), mRNA was extracted using the jiMACS 
kit (Miltenyi Biotech). After verification of mRNA integrity 
using a Bionanalyser (Agilent), mRNA was either directly sub- 
mitted to sequencing, or cDNA was synthesized and DSN 
normalized (i.e., enriched for less abundant transcripts using 
Kamchatka crab duplex-specific nuclease) using published 
procedures (Zhulidov et al. 2004). Normalized cDNA samples 
were polymerase chain reaction (PCR)-amplified (30 cycles), 
and the 5'-polyT cDNA synthesis adaptor was removed using 
restriction enzymes. RNA samples and normalized cDNA sam- 
ples, pooled or individually (supplementary fig. S1 and table 
S1 , Supplementary Material online), were submitted to Roche/ 
454 (Macrogen) and lllumina single-end sequencing (libraries 
were prepared according to standard protocols). The samples 
sequenced using Roche/454 were three males (M1, M2, and 
M3), pooled and not normalized; three females (F1, F2, and 
F3), pooled and normalized; and one juvenile female (F4), 
normalized. The samples sequenced using lllumina were 
three females (F1, F2, and F3), pooled and normalized, and 
one male (M4), not normalized. 
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Contig Assembly and Database Searches 

The two types of reads (Roche, 200-500 bp, and lllumina, 
114 bp) were assembled in two steps (see supplementary 
fig. S1 and table S1, Supplementary Material online). 

Step 1: Preselection Assembly 

Adapters removal and quality trimming as well as assembly of 
all Roche/454 reads (from M1-3, F1-3, and F4) were per- 
formed with SeqMan NGen v.2 (DNASTAR). Reads obtained 
from lllumina were subjected to adapter removal and quality 
trimming using the FASTX-Toolkit (http://hannonlab.cshl.edu/ 
fastx_toolkit, last accessed February 3, 2013) and assembled 
using Velvet with the Oases extension (Zerbino and Birney 
2008). Optimal k-mer lengths (match length used by Velvet) 
may differ for transcripts with different abundances (Gibbons 
et al. 2009; Surget-Groba and Montoya-Burgos 2010). To 
maximize the chances of identifying VR transcripts, we applied 
an additive k-method as described in Surget-Groba and 
Montoya-Burgos (2010). K-mer values of 51, 61, 71, and 81 
were used, followed by the removal of redundant contigs with 
CD-HIT-EST (Li and Godzik 2006). The ShortRead R package 
(www.r-project.org, last accessed February 3, 2013) was used 
for analyzing lllumina data (Morgan et al. 2009). Assembled 
Roche and lllumina contigs and unassembled Roche reads 
(singletons) with length above 200 bp were annotated using 
BLASTX, implemented in LANE runner (Tzika et al. 201 1), with 
the following criteria: minimal match length 30 amino acids 
(aa), minimal sequence identity 50%, maximal E value 0.05, 
and release 60 of the Ensembl protein database for the fol- 
lowing species: Anolis carolinensis (anole lizard), Danio rerio 
(zebrafish), Monodelphis domestica (opossum), Mus musculus 
(mouse), Ornithorhynchus anatinus (platypus), Rattus norvegi- 
cus (rat), and Xenopus tropicalis (western clawed frog). The 
anole lizard sequence that we identified as V1R (see below) 
was included in the search. Assembled lllumina contigs as well 
as assembled (contigs) and unassembled (singletons >200 bp) 
Roche/454 sequences are available at http://www.reptilian- 
transcriptomes.org (last accessed February 3, 2013) 
(supplementary files S6a and b, Supplementary Material 
online). 

Step 2: Final Assembly 

Reads contributing to contigs with a VR hit and singletons 
with a VR hit (in total 8,048 Roche and 54,191 lllumina 
reads) were pooled and assembled using SeqMan NGen v.2 
(DNASTAR). Assembled contigs were annotated using LANE 
runner (Tzika et al. 201 1) as described earlier. Although strin- 
gency of BLASTX search criteria was low, all the V2R contigs 
were identified with a very high confidence (£<10~ 9 ). Read 
alignments of potential V2R contigs were manually corrected 
for long homopolymer errors introduced by Roche/454 pyro- 
sequencing. Potential VR transcript fragments were translated 



into aa sequences and aligned using the MAFFT multiple align- 
ment program (Katoh et al. 2002). After manual editing of the 
alignment using JalView (Waterhouse et al. 2009), contigs 
spanning mostly the 3 / -UTR or with stop codons in the 
coding part of the transcript were removed. The remaining 
set of 196 potentially functional V2R transcripts are listed and 
their sequences provided, in supplementary file S1, Supple- 
mentary Material online. Sequences with stop codons in the 
coding sequence are listed and their sequences provided, in 
supplementary file S2, Supplementary Material online. 
Visualization of contig statistics and properties (fig. 2) was 
performed using the R software (www.r-project.org, last 
accessed February 3, 2013). 

Estimation of Transcript Number and Variability 

To estimate the number of distinct V2R transcripts and their 
sequence variability, we used the 350 aa of the alignment, 
which correspond to the 9-cysteine domain and the 7TM 
domain. We divided the alignment into four segments 
(fig. 2D) and, within each segment, we counted the number 
of unique sequences among those with less than 50% missing 
data. We aligned the corresponding nucleotide sequences 
and counted the number of unique nucleotide sequences. 
To calculate sequence variability, we divided the aa alignment 
into eight segments and computed pairwise identity among 
sequences with no missing data. 

PCR Validation of Assembled Contigs 

PCR primer pairs were designed for each of the 25 selected 
contigs. cDNA samples from the VNO of one male and one 
female corn snakes were prepared as described earlier, pooled 
and used as templates for PCR. Products of the expected sizes 
were amplified for 24 of the 25 PCR assays and submitted to 
capillary Sanger sequencing: 23 of the 24 sequences could be 
aligned to the original corresponding contig. Percentages of 
nucleotide differences were calculated over the validated 
sequence length (supplementary fig. S2, Supplementary 
Material online). Sequences of the 23 primer pairs used for 
validation (amplification and sequencing) of the predicted 
sequences are listed in supplementary table S3, Supplemen- 
tary Material online. 

Potential Python V2R Genes 

We performed Basic Local Alignment Search Tool (BLAST) 
analyses using our newly identified corn snake V2R transcripts 
as input, as well as V2Rs from other vertebrates, to detect V2R 
genes in the Burmese python {Python molurus bivittatus) draft 
genome (Castoe et al. 201 1). Our TBLASTX analyses revealed 
the presence of 2 1 6 partial sequences of likely V2R genes that 
were also compared with the nonredundant National Center 
for Biotechnology Information (NCBI) database to confirm 
their annotation (supplementary file S3, Supplementary 
Material online). For each of these shotgun-sequenced frag- 
ments of the Python genome, the longest open reading frame 
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(ORF) was identified using the ORF finder of NCBI (supplemen- 
tary file S4, Supplementary Material online). Among these, 
114 ORFs (longer than 140 aa) were selected and aligned 
with known V2Rs. This alignment indicated that the majority 
of the Python V2R sequences correspond to the variable N- 
terminus as only 34 aligned with the 7TM domain. 

Phylogenetic Analyses 

aa sequences of the 7TM region (265 aa) of 66 V2R corn 
snake contigs spanning at least 60% of the 7TM region 
were aligned with 67 representative V2R sequences of zebra- 
fish, frog, anole lizard, mouse, and opossum using MAFFT 
(Katoh et al. 2002), followed by manual adjustment using 
JalView (Waterhouse et al. 2009). An alignment with the 
addition of 1 5 Python ORFs was also performed. GTR20 was 
selected as the best aa-substitution model using the Akaike 
Information Criterion implemented in MetaPIGA-v2.1 (http:// 
www.metapiga.org (last accessed February 3, 2013) [Helaers 
and Milinkovitch 2010]). Maximum likelihood (ML) trees were 
computed using the Metapopulation Genetic Algorithm 
(MetaGA [Lemmon and Milinkovitch 2002]) implemented in 
MetaPIGA-v2.1 (Helaers and Milinkovitch 2010). We used 
probability consensus pruning among four populations of 
four individuals each and estimated posterior probability dis- 
tribution of possible trees by performing replicated metaGA 
searches and stopping when the mean relative error values 
among 10 consecutive consensus trees remained below 2%. 
ML trees were also inferred using RaxML (v7.2.6) with 500 
rapid bootstrap replicates (Stamatakis 2006). Finally, MC 3 ana- 
lyses under a Bayesian framework were performed using 
MrBayes (v3.2) (Huelsenbeck et al. 2001), and posterior prob- 
abilities were estimated after 2 million generations (burn in = 
7,000 trees sampled every 100 generations). The RaxML 
topology is shown in figure 3 (branches with bootstrap sup- 
port below 50% are collapsed), and RaxML, MetaPIGA, and 
MrByes branch supports are indicated above the branches of 
major clades. 

Identification of Lizard and Snake V1R Sequences 

Representative V1 R nucleotide sequences of zebrafish, opos- 
sum, mouse, platypus, rat, and frog were used as input of 
TBLASTX queries against the anole genome (Ensembl release 
64). This led to the identification (by "best-reciprocal hit") of a 
single Anolis sequence (ENSACAG00000025768) as a V1R 
homolog. The presence of the corresponding sequence in 
the genomic DNA of the anole lizard was confirmed by 
PCR, followed by Sanger sequencing. Known V1R nucleotide 
sequences of various vertebrates along with the newly identi- 
fied Anolis V1 R gene were also compared (TBLASTX using 
LANE runner) with the fully sequenced genome of the 
Burmese python {Python molurus bivittatus), as well as the 
multiorgan transcriptome of the garter snake (Thamnophis 
sirtalis). Although no significant hit was retrieved from the 



queries against the garter snake database, two potential 
Python V1R sequences were identified as two independent 
shotgun-sequenced fragments of the species genome 
(Python V1r1— GenBank: AEQU01 0364851 .1 and Python 
V1rb— GenBank: AEQU01 037681 4.1; from the whole- 
genome shotgun sequencing project AEQU000000000.1). 
Searching our corn snake VNO transcriptome with TBLASTX 
did not reveal any V1 R homologs. On the other hand, two 
slightly different copies of each Python V1 ra1 and V1 r2 were 
successfully amplified from Pantherophis guttatus genomic 
DNA using primers based on the Python sequences (supple- 
mentary file S7, Supplementary Material online). The distances 
of the two copies to each other and to several vertebrate V1 Rs 
were computed using MetaPIGA-v2.1 (Helaers and Milinko- 
vitch 2010), both for the nucleotide (with Jukes-Cantor cor- 
rection) and the aa sequences (with Poisson correction). The 
lizard and snake V1Rs thus identified were translated and 
aligned with their mouse, fish, and frog homologs. The 
poorly aligned regions of the 438-aa MAFFT alignment were 
trimmed with the "strict" criterion of trimal (Capella-Gutierrez 
et al. 2009) implemented into MetaPIGA-v2.1 (Helaers and 
Milinkovitch 2010), and a final alignment of 230 aa per 
sequence was phylogenetically analyzed as described earlier 
for the V2R alignment. 

mRNA In Situ Hybridization 

Templates for probes were amplified (primers listed in supple- 
mentary table S3, Supplementary Material online) from a 
mixed sample of VNO cDNA from male and female corn 
snakes. Digoxigenin (DIG) and fluorescein (FLUO) probes 
were synthesized according to the DIG and FLUO RNA labeling 
kits protocols (Roche). Sixteen-micrometer cryosections were 
prepared from freshly frozen VNO of male and female corn 
snakes. Sections were fixed in 4% PFA for 20 min at room 
temperature. Single- and double-staining in situ hybridizations 
were performed using published protocols (Riviere et al. 
2009). Hybridizations were carried out at 62 °C for 14h. 
Anti-DIG antibody coupled to alkaline phosphatase (AP) 
and anti-FLUO antibody coupled to horseradish peroxidase 
(POD) were used (Roche). FastRed (Sigma) and biotinyl-tyra- 
mide (PerkinElmer) were used as substrates for AP and POD, 
respectively. Biotinyl-tyramide was detected by streptavidin- 
conjugated Alexa Fluor 488 (Invitrogen). Sections were coun- 
terstained with Hoechst (Invitrogen) and mounted with 
Vectashield (Vector Labs). Low-magnification images were 
taken with a standard fluorescent microscope (Zeiss). High- 
magnification images were taken with a confocal microscope 
(Leica). The number of cells expressing receptors and the total 
number of cells in the sensory epithelium were manually 
counted and used to calculate the percentage of expressing 
neurons in supplementary figure S5, Supplementary Material 
online. 
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Data Availability 

All developed tools and supplementary files, Supplementary 
Material online, are available at http://www.reptilian-transcrip 
tomes.org (last accessed February 3, 2013). 

Results and Discussion 

The Corn Snake VNO 

The corn snake, suggested as one of the most convenient 
reptilian model species (Milinkovitch and Tzika 2007), was 
selected for the investigation of the vomeronasal chemorecep- 
tor repertoire of snakes (fig. 1A and B). First, to identify the 
sensory part of the vomeronasal epithelium in the corn snake 
VNO, we retrogradely labeled sensory neurons by applying a 
lipophilic neuronal tracer (Godement et al. 1987) on the AOB 
of a fixed brain. Figure 1 C-E indicates that the majority of cells 
present in the vomeronasal epithelium are neurons whose 
axons project to the AOB and whose dendrites protrude 
into the lumen of the VNO cavity. The corn snake VNO neu- 
roepithelium is organized in columns (fig. IB and £), like in 
other snake species (Wang and Halpern 1 980; Taniguchi et al. 
2000). The unlabeled cells in the basal zone of the 
epithelium likely represent neuronal stem cells. At the edges 
of the epithelium, we observe groups of labeled cells that 
possibly represent a pool of immature neurons, as their den- 
dritic projections into the lumen are not labeled (arrowheads, 
fig. 1D). 

The Snake VNO Chemoreceptor Transcriptome 

We isolated vomeronasal mRNA from four male and four 
female individuals, of which all but one female were sexually 
mature. Some of the produced cDNA samples were normal- 
ized to enrich for less abundant transcripts. We submitted two 
normalized and two non-normalized samples to Roche/454 
and lllumina sequencing (see supplementary table S1 , Supple- 
mentary Material online, for sequencing and assembly statis- 
tics). In total, we obtained 343,062 reads of 400 bp mean 
length (Roche/454) and 54,394,908 reads of 114 bp 
(lllumina). The high number of lllumina reads allowed for 
greater depth coverage, whereas the longer Roche/454 
reads reduced the chances of chimeric assemblies. Following 
a two-step approach (supplementary fig. S1, Supplementary 
Material online), we first generated contigs separately from 
the two types of reads and identified (using BLASTX against all 
protein-coding sequences of seven vertebrate species) the 
snake contigs likely to encode for vomeronasal receptors. 
Second, the reads contributing to these contigs were pooled 
into a single set of 62,239 sequences that were assembled 
into 467 mixed-read contigs. Out of these, 196 contigs were 
identified as fragments of likely functional snake V2R tran- 
scripts based on high-confidence BLASTX similarity hits 
(£<10~ 9 ) against V2R transcripts from other species (fig. 
20 and the absence of stop codons in the protein-coding 



part of the sequence (the 7TM and cysteine domains) (supple- 
mentary file S1 , Supplementary Material online). We also iden- 
tified 39 contigs representing putative pseudogenes as they 
included stop codons leading to a translation termination 
before the end of the seventh transmembrane domain (sup- 
plementary file S2, Supplementary Material online). We did 
not observe any large insertions or deletions in any of these 
putative V2Rs or pseudogenes. To validate our results, we 
selected 25 contigs for PCR amplification and sequencing. 
Among these, 92% (23/25) could be amplified, and their 
sequences were identical or nearly so with the corresponding 
target sequence (average percentage of nucleotide differ- 
ences =0.01 2; supplementary fig. S2, Supplementary Material 
online). These few differences between target and amplified 
sequences were attributed to transcriptome sequencing 
errors, amplification of closely related gene family members, 
or polymorphisms between individuals used for sequencing 
and validation. Taken as a whole, our validation demonstrates 
that the majority of assembled contigs represent genuine 
transcripts. 

Both the mean and median lengths of snake V2R contigs 
were approximately 500 bp (fig. 2A). Hence, most of them 
represent a fragment of a full transcript (the approximate 
mean length of a V2R coding sequence in vertebrates is 
2, 100 bp). V2R sequences corresponding to the N-terminus 
are under-represented in our data set for at least two reasons: 
1) The low evolutionary conservation of the V2R N-terminal 
region makes homology assignment difficult using BLASTX 
searches against the genomes of other vertebrates and 2) 
cDNA libraries tend to be biased against 5' sequences. To 
estimate the minimum number of distinct transcripts present 
in our assembly, we first translated the 1 96 putative V2R snake 
transcripts into aa sequences. Next, we generated multiple 
alignments of nucleotide (nt) and aa sequences with MAFFT 
(Katoh et al. 2002). We divided the alignments into four seg- 
ments (of 228-279 nt or 76-93 aa). In each of these segments, 
we counted the number of unique sequences (fig. 2D); note 
the N-terminus high variability of V2R sequences. The third and 
fourth segments of the alignment, located in the more con- 
served 7TM domain, indicated that our 196 snake V2R 
sequences expressed in the corn snake VNO correspond to a 
minimum of 116 distinct transcripts, encoding 109 different 
proteins. 

The suggestion of a large and variable V2R repertoire in 
snakes is also supported by our TBLASTX analyses (supplemen- 
tary file S3, Supplementary Material online) against 1) the 
Burmese python {Python molurus bivittatus) draft genome 
(Castoe et al. 2011), in which we identified 216 partial 
sequences of potential V2R genes and 2) a garter snake 
(Thamnophis elegans) multiorgan transcriptome (from brain, 
gonad, heart, kidney, liver, spleen, and blood tissues but not 
olfactory tissue [Schwartz et al. 2010]) in which we identified 
38 potential V2R genes. 
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Fig. 1. — The VNO of the corn snake. (A) Corn snake. (B) Coronal section of a corn snake VNO stained with hematoxylin and eosin (HE). Full organ (left), 
scale bar 200 |im; close-ups of the columnar sensory epithelium (top right) and the zone in contact with the outside world (bottom right), scale bars 20 \im; 
SE, sensory epithelium; NE, nonsensory epithelium; L, lumen. (0 Schematic representation of a head hemi-section with a picture of the fluorescence detected 
in the VNO and the main olfactory epithelium (MOE) after application of a retrograde tracing dye onto the AOB and main olfactory bulb (MOB). (D,f) Coronal 
section of the VNO with fluorescence (red) detected after application of the tracing dye onto the AOB. Arrowheads and arrows in (D) indicate groups of 
potentially immature neurons at the edges and base, respectively, of the neuroepithelium; arrows and dotted frame in (£) indicate dendritic projections to the 
lumen. DNA is stained with Hoechst (blue). Scale bar in (D), 200 urn and in (£), 50 firm. 



Reptile-Specific Expansion of V2Rs 

To infer the evolutionary history of snake V2Rs, we performed 
phylogenetic analyses of snake V2R aa sequences with homo- 
logs from various vertebrate species. A multiple alignment was 
built among a selection of 67 sequences from Anolis caroli- 
nensis, Mus musculus, Monodelphis domestica, Xenopus tro- 
picalis, and Danio rerio, as well as 66 of our corn snake 
sequences (only sequences that spanned at least 60% of 
the 7TM domain were used; supplementary fig. S3, Supple- 
mentary Material online). Note that the majority of these corn 
snake sequences have an average depth coverage above 10 
(fig. IB). The consensus topology shown in figure 3 is sup- 
ported by three different heuristics for large phylogeny infer- 
ence: the metapopulation genetic algorithm (Lemmon and 
Milinkovitch 2002) implemented in metaPIGA-v2 (Helaers 
and Milinkovitch 2010), the rapid bootstrap analysis carried 
out with RaxML (Stamatakis 2006), and MC 3 Bayesian 



estimation performed with MrBayes (Huelsenbeck et al. 
2001). The gene-family tree includes the three well-described 
mammalian V2R subfamilies A, B, and D, as well as the more 
distantly related subfamily C (Yang et al. 2005; Grus et al. 
2007; Young and Trask 2007). Our analyses indicate that 
family C is also present in nonmammalian vertebrates but is 
represented by a single member both in the anole lizard and 
the corn snake. The V2R gene tree also includes a large reptile- 
specific clade whose topology indicates that 1) a significant 
portion of the snake V2R repertoire arose before the split of 
this lineage from other squamates (i.e., several monophyletic 
groups include both corn snake and anole sequences) and 2) 
snakes experienced further evolution of their V2R repertoire 
but in the form of multiple small expansions (of which five are 
indicated in fig. 3). Given the short size of the V2R alignment 
(265 aa/sequence), increasing the number of sequences in the 
alignment will tend to reduce the robustness of branches 
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during phylogeny inference. We anyway built a second align- 
ment with the addition of 1 5 Python putative V2R fragments 
spanning at least 60% of the 7TM domain. Phylogenetic ana- 
lyses of this larger data set (supplementary fig. S4, Supplemen- 
tary Material online) yield a topology highly similar to that 
shown in figure 3, with Python sequences clustering with 
corn snake sequences (including one Python sequence in 
family C; supplementary fig. S4, Supplementary Material 
online). 

It is most likely that the clear separation between mamma- 
lian, frog, and squamate non-C clades reflects phylogenetic 
signal rather than long-branch attraction or homogenization 
through gene conversion (Mallon et al. 2004; Ezawa et al. 
2006). The latter hypothesis could be investigated through 
analysis of synteny conservation and patterns of sequence 
divergence among closely related paralogs (Sawyer 1999). 

Expression Patterns of V2R Subfamilies Are Conserved 
among Vertebrates 

The expression of snake V2R genes was investigated with in 
situ hybridizations on corn snake VNO sections (fig. 4) using 
probes corresponding to putatively functional V2R transcripts 
(indicated with arrows in fig. 3: V2R-EG001, EG107, EG108, 



and EG154). We also analyzed the expression pattern of 
one identified pseudogene {V2R-EG204). At least three 
snakes were analyzed for each receptor gene, and similar 
patterns of expression were observed among individuals (sup- 
plementary fig. S5, Supplementary Material online). All tested 
transcripts were detected in both males and females. For each 
of the five V2R genes, we detected expression in the vomer- 
onasal sensory neuroepithelium (fig. 4) and not in the MOE. 
Similar to what has been reported for mice (Herrada and 
Dulac 1997; Matsunami and Buck 1997; Ryba and Tirindelli 
1997), we observed punctate signals (e.g., V2R-EG107 and 
EG 154) corresponding to single neuron expression, with a 
cytoplasmic localization of V2R mRNA. As observed in other 
vertebrate species, the probability of expression of a given V2R 
from non-C subfamilies is gene dependent. This probability is 
high for V2R-EG108 and lower for V2R-EG1 07/1 54/204 
(figs. 4 and 5). We note that we cannot exclude that distinct, 
but very closely related, receptor transcripts were codetected 
by some probes. 

One striking feature of V2Rs in the mouse VNO is that 
subfamily-C members are broadly expressed and are coex- 
pressed in the same cells with non-C (ABD) V2Rs (Martini 
et al. 2001; Silvotti et al. 2007; Ishii and Mombaerts 2011). 
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The functional relevance of this peculiar pan- and coexpres- 
sion pattern, if any, is unknown but could reflect that these 
receptors work as heterodimers with one monomer widely 
expressed and the other specifically expressed. Interestingly, 
we find that the corn snake subfamily-C gene (V2R-EG001) is 
expressed in the majority of the vomeronasal sensory neurons 
(fig. 4 and supplementary fig. S5, Supplementary Material 
online). To test for monogenic expression of non-C and C 
V2Rs, we performed double-staining in situ hybridizations. 
Figure 5A indicates that V2R-EG154 and V2R-EG107, that 
is, two close members of the non-C family of snake V2Rs 
(fig. 3), show mutually exclusive transcription. In contrast, 
the snake gene belonging to subfamily C {V2R-EG001) is 



coexpressed with members of non-C subfamilies, such as 
V2R-EG1 54 {f\g. 5B). 

V1 R versus V2R Receptor Repertoires 

No V1R sequences were found in our snake vomeronasal 
transcriptome, even under relaxed criteria of BLASTX similarity 
searches. To further investigate the potential lack of V1R 
genes in corn snakes and other Sauropsida reptiles, we per- 
formed TBLASTX searches against the fully sequenced gen- 
omes of the anole lizard (Ensembl release 64) and the Burmese 
python (Castoe et al. 201 1), as well as the multiorgan tran- 
scriptome of the garter snake (Schwartz et al. 201 0). Although 
no lizard V1R gene is predicted in the Ensembl database 
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Fig. 4. — Expression of V2R transcripts in the snake vomeronasal sen- 
sory epithelium. In situ hybridizations (green) of coronal sections of a 
female corn snake VNO, with antisense RNA probes for one pseudogene 
(V2R-EG204) and four likely functional V2R transcripts. DNA is stained with 
Hoechst (blue). Scale bar, 20 ^im. 



(whereas 39 genes are annotated as V2Rs), our analyses 
recognized as a V1R a single anole previously unidentified 
protein-coding sequence. We confirmed the presence of this 
sequence in the anole lizard genome by PCR amplification 
followed by sequencing. Our in silico analyses also identified 
two potential V1 Rs (called here Python V1r1 and V1r2) in the 
Python draft genome, and their presence was confirmed by 
PCR amplification from Python genomic DNA. On the other 
hand, no significant match was obtained by searching the 
multiorgan garter snake transcriptome. We then designed 
PCR primers based on the Python V1R sequences that were 
used to probe the corn snake genomic DNA. This approach 
led to the identification of four V1 R genes. Based on the 
distances among corn snake and Python sequences (supple- 
mentary table S2, Supplementary Material online), it is likely 
that these four corn snake sequences (called Pantherophis 
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Fig. 5. — Monogenic versus nonmonogenic expression of V2Rs. In situ 
hybridizations with antisense RNA probes for (A) two ABD-subfamily 
members V2R-EG154 (green) and V2R-EG107 (red), (B) one C (V2R- 
EG001; red), and one ABD (V2R-EG154; green) subfamily members; coex- 
pression of V2Rs is only observed between family C and family A/B/D 
members. Left and middle panels show single channels in green and 
red, and right panel shows the merge plus Hoechst staining of DNA in 
blue. Scale bar, 10 urn. 



V1r1-4) correspond to two alleles or close paralogs for each 
of the two V1 Rs. 

We also performed phylogenetic analyses of a 230-aa data 
set including representatives of the 1 2 V1 R mouse subfamilies 
(Rodriguez et al. 2002), the 19 known frog V1Rs (Date-lto 
et al. 2008), V1Rs from teleost fish (Pfister et al. 2007), the 
detected anole lizard V1 R along with additional fish and frog 
homologs (as provided by Ensembl release 66), and the Python 
and corn snake V1 Rs identified above. The resulting ML phy- 
logenetic tree indicates that the snake V1 Rs (along with their 
orthologs from lower vertebrates) belong to two distant 
lineages of paralogs (fig. 6), one of which includes the anole 
V1 r. Hence, the reptilian V1 Rs are not reptilian acquisitions. In 
situ hybridizations of corn snake VNO sections revealed 
vomeronasal transcripts of Pantherophis V1r3 and 4 that 
were absent or present depending on the individual exam- 
ined, whereas no Pantherophis V1r1 or 2 transcription was 
detected. The reason for this variability is unknown but may 
be correlated with seasonal expression because positive stain- 
ing was observed during the mating period but not during 
hibernation (supplementary fig. S6, Supplementary Material 
online). Additional experiments are required to statistically 
determine whether this is true. 

Furthermore, we searched our snake VNO transcriptome 
data sets for other GPCR classes. We did not find any 
sequences with significant similarity to trace amine-associated 
receptors (TAARs) (Liberies and Buck 2006) or FPRs (Riviere 
et al. 2009) both of which are expressed in the murine olfac- 
tory system. We, however, found several olfactory receptor 
(ORs) transcripts (data not shown) that may have originated 
from a minor contamination of our samples by snake MOE. To 
test this possibility, we used a Pantherophis OR probe (Locus_ 
18943 — supplementary file S5, Supplementary Material 
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online), corresponding to a transcript that was identified in our 
sequence data set, and found punctate transcription in the 
MOE and no transcription in the VNO (data not shown), con- 
sistent with the idea of MOE contamination. 

Taken together, our results strongly suggest that squamate 
reptiles (snakes and lizards) have a very limited repertoire of 
functional V1 Rs. This makes the ratio of squamate V1 R versus 
V2R sequences very similar to the ones observed in amphi- 
bians and teleosts, in contrast to mammals whose genomes 
encode more V1 Rs than V2Rs (fig. 2£). 

Conclusions 

We report on the identification of the vomeronasal receptor 
repertoire in snakes (i.e., the lineage with the most developed 
VNO among all vertebrates), as well as the analysis of its 
expression pattern and evolutionary history. Our analyses 
demonstrate that 1) snakes exhibit a large V2R repertoire 
but a very limited number of V1R genes and 2) the peculiar 
V2R expression pattern observed in the VNO of mice (mono- 
genic expression for V2R ABD-subfamily members and broad 
expression of C-subfamily members) is conserved in snakes. 



Our deep-sequencing analyses suggest that the corn snake 
genome contains more than 116 V2R genes. Although we 
cannot exclude that some reconstructed transcripts combine 
polymorphisms among individuals, this number is likely to 
reflect a minimal estimation of the snake V2R repertoire 
because 1) we identified 196 high-quality partial V2R 
sequences that potentially represent distinct transcripts (thus 
making the snake V2R gene repertoire the second largest 
known after Xenopus), 2) we identified 216 potential V2R 
genes in the Python draft genome, 3) our transcriptome 
approach, contrary to genomic surveys, might have missed 
poorly transcribed and temporally regulated VR genes, and 
4) some VR genes may be expressed in nonolfactory structures 
and be absent from olfactory neurons. Our phylogenetic ana- 
lyses indicate the presence of reptile-specific and also of 
snake-specific clades of V2R genes (fig. 3) that may reflect 
specialization of specific snake receptors to particular ligands. 
This specialization is, however, unlikely to be a rule because 
electrophysiological recordings in garter snakes showed that 
individual neurons of the vomeronasal sensory epithelium 
respond to multiple classes of stimuli, including different pep- 
tides and aa (Inouchi et al. 1993). The snake vomeronasal 
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neuroepithelium is elaborate and highly developed, and 
almost all VNO sensory neurons express V2Rs. One might 
have expected a correspondingly large chemoreceptor gene 
repertoire. In fact, we find that the snake repertoire is quite 
sizable but not significantly larger than the one found in 
rodents. This shows that the large size of the snake VNO is 
not the result of a need to host a larger chemoreceptor 
repertoire but suggests that this organ is under another selec- 
tive pressure. 

The snake large V2R repertoire is accompanied by a surpris- 
ingly small number (2-4) of V1 R genes. Note that expression 
of a G-protein subunit specifically associated with V1 R recep- 
tors in mammals (Gi2alpha) has previously been detected in 
vomeronasal epithelial cells of garter snakes (Luo et al. 1994). 
Our own analyses of the snake V2R expression patterns also 
support the existence of only a few snake vomeronasal sen- 
sory neurons devoted to the expression of V1R receptors 
because 1) in mice, V1Rs and V2Rs expressing neurons are 
distributed in distinct zones of the VNO and are never coex- 
pressed in sensory neurons, 2) the expression patterns of V2Rs 
observed in rodents are largely conserved in snakes (figs. 4 and 
5), and 3) a V2R C-subfamily homolog is expressed in nearly all 
cells of the snake VNO sensory epithelium (fig. 4, supplemen- 
tary fig. S5, Supplementary Material online). The very limited 
number of V1 R genes in the snake genome could be the result 
of two different evolutionary histories. The first one would be 
characterized by a small repertoire of V1Rs in the tetrapod 
ancestor followed by an expansion of V2Rs in squa mates 
and an expansion of V1Rs in mammals, both after the split 
between the two lineages. A second potential history could 
involve a pre-existing large V1R repertoire that would have 
contracted in squamates and resulted in a few remnants 
today. This latter possibility should have left traces, that is, 
V1R pseudogenes in squamates. Our analysis of the Anolis 
genome, which did not reveal V1 R pseudogenes, supports a 
history that lacked V1 R expansions. 

Our analyses of the snake VNO transcriptome elucidates 
how the repertoire of vomeronasal receptors evolved in 
different lineages of terrestrial vertebrates. Both fully aquatic 
and semiaquatic amphibians possess a well-developed VNO 
(Scalia et al. 1991; Eisthen 2000). Similar to mammals, they 
have large vomeronasal receptor repertoires, such as the frog 
Xenopus tropicalis, whose genome encodes 330 V2R and 21 
V1R genes (Shi and Zhang 2007). Therefore, the common 
ancestor of extant amphibians and amniotes possessed a 
VNO and had both V1R and V2R receptors (Swaney and 
Keverne 2009). We detected four V1 R-like sequences in the 
corn snake genome, two in the Python genome and a single 
V1R gene in the anole lizard genome. Therefore, contrary to 
the situation observed in mammals, the V1 R repertoire did not 
expand in squamate reptiles. This is inconsistent with the 
hypothesis that the expansion of the V1R family arose in 
response to the transition of early tetrapods from detecting 
water-soluble to identifying air-borne ligands (Shi and Zhang 



2007). Hence, mammals and squamate reptiles developed 
different strategies for the detection of ligands via their 
VNOs: In mammals, the number and diversity of the V1R 
family members are high (accompanied by a V2R repertoire 
of variable size), whereas squamate reptiles rely almost entirely 
on their V2R repertoire. Although the ability of mammals to 
detect airborne ligands (i.e., without contact with the source 
of the ligand) with their VNO remains controversial (Luo et al. 
2003), we suggest that snake V2Rs are very likely able to 
detect airborne molecules because 1) our data show that 
the snake repertoire of vomeronasal receptors is largely domi- 
nated by V2Rs and 2) an increase in tongue flicking frequen- 
cies (delivering molecules to the VNO) has been observed in 
response to airborne chemical stimuli (Zuri and Halpern 2003). 
Therefore, we argue here that adaptation and expansion of 
the V2R repertoire in squamate reptiles for the detection of 
ligands was facilitated by the development, especially in 
snakes and allies, of a sophisticated tongue delivery system 
allowing for both volatile and nonvolatile molecules to effi- 
ciently reach the VNO. 

Our phylogenetic analyses of V2R sequences, as well as our 
in situ hybridization experiments in corn snakes, indicate that 
the dichotomy between C-subfamily and ABD-subfamily 
members exists in all vertebrates investigated so far. Indeed, 
similar to what was previously reported in fishes, amphibians, 
and mammals (Yang et al. 2005), at least one family-C 
member is present in squamate reptiles as well. Our analyses 
further indicate that the expression pattern of the C-subfamily 
of V2Rs (expression in most VNO sensory neurons and coex- 
pression with subfamily-ABD members) is conserved in squa- 
mate reptiles, hinting at an important functional role played by 
this coexpression strategy. This role is still elusive but could 
reflect a necessary heterodimerization for receptor function 
as is observed for other GPCRs. 

In recent years, evolutionary dynamics of vomeronasal 
receptor repertoires have been extensively studied using fully 
sequenced mammalian genomes. Shifts in vomeronasal 
receptor repertoires have been associated with adaptations 
to different habitats and/or life histories, as well as to the 
development of social structures (Grus et al. 2005, 2007; 
Young et al. 2005; Young and Trask 2007; Wang et al. 
2010). Our study opens new perspectives for the exploration 
of vomeronasal receptor repertoires in Sauropsida reptiles, a 
group for which an increasing number of new model species 
are being developed (Milinkovitch and Tzika 2007; Tzika and 
Milinkovitch 2007) and genomic/transcriptomic data are 
emerging (Schwartz et al. 2010; Alfoldi et al. 2011; Castoe 
et al. 201 1 ; Tzika et al. 201 1 ; St John et al. 201 2). This verte- 
brate lineage includes Testudines (turtles), Lepidosauria (the 
tuatara and squamates), and Archosauria (crocodiles and 
birds). It comprises (after exclusion of the 10,000 species of 
birds) twice as many species as mammals, and it exhibits 
adaptations to very diverse habitats ranging from dry deserts 
to almost exclusive riverine or marine waters. Moreover, 
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morphological, behavioral, and electrophysiological studies 
point to a broad usage of the vomeronasal system not only 
in snakes but also in lizards and turtles (Mason and Parker 
2010). Evolutionary and functional analyses of vomeronasal 
receptor repertoires in multiple Sauropsidia reptiles will 
uncover how reptiles and mammals differentially adapted 
the same ancestral chemoreceptor toolkit to exploit the ter- 
restrial environment. 

Supplementary Material 

Supplementary files S1-S7, tables S1-S3, and figures S1-S6 
are available at Genome Biology and Evolution online (http:// 
www.gbe.oxfordjournals.org/). 
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